Generative Lexicon Principles for Machine Translation: A Case for Meta-Lexical
Abstract
This paper addresses two types of mismatches in the translation of reported speech between German and English. The first mismatch is between the repeated use of the reported speech construction in English and the use of subjunctive in German used to indicate continued attribution. The second mismatch concerns the difference in usage of metonymic extensions in the subject position of reported speech. Examples show the different styles of reporting the utterances of somebody else. A well-structured lexicon is presented as one step to the solution of the problems presented. One key feature of the proposed lexicon is a meta-lexical organization of basic word entries, which is shown to facilitate the translation process. We contrast our notions of lexical structure with different recent proposals in machine translation.
1. Motivation
One traditional source of information about the world, the newspaper, is adjusting to the need for faster dissemination of news by providing on-line services. There are two tasks: there is too much information, thus information needs to be extracted as much as possible automatically. This problem will not be addressed in this paper and, once solved, might reduce the urgency (but not the need) for the second task, namely making information presented in different languages available. This second problem should be attacked as much as possible automatically, as well.
Newspaper articles have not found separate discussion to date in computational linguistics, including machine translation. The particular style adhered to in newspaper articles is, however large, a sublanguage. This paper addresses some aspects of newspaper style and the problems this poses for machine translation. The most striking feature of the North American newspaper style is the frequency of the use of reported speech. This style is conventionalized rather than purely functional, as can be seen from the different styles observed in different countries.
We will limit our attention here to English and German. Even for such closely related languages and speakers the differences are important enough to be considered in detail. The stylistic differences we will consider in this context are limited to the use of subjunctive and of metonymic extensions. This paper aims to exemplify rather than describe exhaustively the different possible styles. The point is to present a well-structured lexicon as one step to the
Similar resources
The Generative Lexicon
In this paper, I will discuss four major topics relating to current research in lexical semantics: methodology, descriptive coverage, adequacy of the representation, and the computational usefulness of representations. In addressing these issues, I will discuss what I think are some of the central problems facing the lexical semantics community, and suggest ways of best approaching these issues...
Inferring a Semantically Annotated Generative French Lexicon from an Italian Lexical Resource
The paper reports on the development and experimental implementation of a combined methodology of knowledge transfer from an existing lexical resource with a view to semiautomatically inducing a new lexicon, in a cross-language environment. The source resource, a multi-layered Italian lexicon whose theoretical approach to the content and representation of semantic information follows the genera...
The semi-generative lexicon: limits on lexical productivity
This paper provides an overview of several different classes of lexical semi-productivity and discusses a general approach to constraining generative devices.
T&F Proofs: Not For Distribution
A central assumption in generative grammar research on the relationship between syntax and the lexicon is that syntax is a projection of the lexicon. The structure of sentences is a reflection of the lexical properties of the individual lexical items they contain. In the standard view, each lexical item is associated with a lexical entry that contains three kinds of information, as indicated i...
A Tool for Multi-Word Expression Extraction in Modern Greek Using Syntactic Parsing
This paper presents a tool for extracting multi-word expressions from corpora in Modern Greek, which is used together with a parallel concordancer to augment the lexicon of a rule-based machine translation system. The tool is part of a larger extraction system that relies, in turn, on a multilingual parser developed over the past decade in our laboratory. The paper reviews the various NLP module...
Publication date: 1995